132 PART 3 Getting Down and Dirty with Data
Taking sides with confidence intervals
As demonstrated in the simulation described in the sidebar “Feel Confident: Don’t
Live on an Island!”, 95 percent CIs contain the true population value 95 percent of
the time, and fail to contain the true value the other 5 percent of the time. Usually,
95 percent confidence limits are calculated to be balanced, so that the 5 percent
failures are split evenly. This means that the true population parameter is actually
less than the lower confidence limit 2.5 percent of the time, and it is actually
greater than the upper confidence limit 2.5 percent of the time. This is called a
two-sided, balanced CI.
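As a rough illustration of how a two-sided, balanced interval splits its 5 percent failure rate, the following Python sketch computes 95 percent confidence limits around a sample mean. The function name `two_sided_ci` and the example glucose values are invented for illustration. The sketch also assumes a normal-approximation critical value (about 1.96), which is a simplification; for small samples, a value from the Student t distribution would be more appropriate.

```python
import statistics


def two_sided_ci(values, confidence=0.95):
    """Return (lower, upper) balanced confidence limits around the mean.

    Assumption: uses a normal-approximation critical value, not Student's t.
    """
    alpha = 1 - confidence  # total failure rate, e.g. 0.05
    # Balanced: half of alpha goes in each tail (2.5 percent per side for 95%).
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    mean = statistics.mean(values)
    std_err = statistics.stdev(values) / len(values) ** 0.5
    return mean - z * std_err, mean + z * std_err


if __name__ == "__main__":
    # Hypothetical blood glucose readings (mg/dL), made up for this example.
    glucose = [90, 110, 100, 95, 105]
    print(two_sided_ci(glucose))
```

Because the same critical value is subtracted from and added to the mean, the two limits sit at equal distances from it, which is what makes the interval balanced.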
FEEL CONFIDENT: DON’T LIVE ON AN ISLAND!
Imagine that you enroll a sample of participants from some defined population in a
study and obtain a sample statistic. As an example, you calculate a mean blood glucose
level from a sample of 50 adult diabetics representing the background population of all
adult diabetics. Assume you calculate a 95 percent CI around this statistic, and then you
assert that you are 95 percent confident that your CI contains the true population value.
But what does that even mean? How can anyone be 95 percent confident? What does
that feel like?
There is a popular simulation to illustrate the interpretation of CIs and help learners
understand what it is like to be 95 percent confident. Imagine that you have a Microsoft
Excel spreadsheet, and you make up an entire population of 100 adult diabetics (maybe
they live on an island?). You make up a blood glucose measurement for each of them
and type it into the spreadsheet. Then, when you take the average of this entire column,
you get the true population parameter (in our simulation). Next, randomly choose a
sample of 50 measurements from your population of 100, and calculate a sample mean
and a 95 percent CI. Your sample mean will probably differ from the population
parameter, but that’s okay: that’s just sampling error.
Here’s where the simulation gets hard. You actually have to take 100 samples of 50. For
each sample, you need to calculate the mean and 95 percent CI. You may find yourself
making a list of the means and CIs from your 100 samples on a different tab in the
spreadsheet. Once you are done with that part, go back and refresh your memory as to
what the original population parameter really is. Get that number, then review all 100
CIs you calculated from all 100 samples of 50 you took from your imaginary population.
Because you made 95 percent CIs, about 95 out of your 100 CIs will contain the true
population parameter (and about 5 of them won’t)! The exact count varies from one run
of the simulation to the next. This simulation does not prove the central limit theorem
(CLT), but it does show the CLT at work, and it helps learners understand what it means
to be 95 percent confident about their CI.
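The spreadsheet exercise described in this sidebar can also be sketched in a few lines of Python instead of Excel. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `simulate_coverage`, the made-up glucose distribution (mean 150, standard deviation 30), and the random seed are all invented for illustration. It also makes two technical choices the sidebar does not discuss: it uses the normal-approximation critical value (about 1.96), and it applies a finite population correction, because each sample draws half of a small population without replacement.

```python
import random
import statistics


def simulate_coverage(pop_size=100, sample_size=50, n_samples=100, seed=1):
    """Count how many 95% CIs contain, fall above, or fall below the true mean.

    Returns (covered, below, above), where 'below' counts samples in which the
    true mean is less than the lower limit and 'above' counts samples in which
    it is greater than the upper limit.
    """
    rng = random.Random(seed)
    # Made-up population of blood glucose values (mg/dL); numbers are invented.
    population = [rng.gauss(150, 30) for _ in range(pop_size)]
    true_mean = statistics.mean(population)  # the "true population parameter"

    z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 (assumption: z, not t)
    # Assumption: finite population correction for sampling without replacement.
    fpc = (1 - sample_size / pop_size) ** 0.5

    below = above = 0
    for _ in range(n_samples):
        sample = rng.sample(population, sample_size)  # without replacement
        mean = statistics.mean(sample)
        std_err = statistics.stdev(sample) / sample_size ** 0.5 * fpc
        lower, upper = mean - z * std_err, mean + z * std_err
        if true_mean < lower:
            below += 1
        elif true_mean > upper:
            above += 1
    covered = n_samples - below - above
    return covered, below, above


if __name__ == "__main__":
    covered, below, above = simulate_coverage()
    print(f"{covered} of 100 CIs contain the true mean "
          f"({below} missed low, {above} missed high)")
```

Running the sketch with different seeds should give counts that hover around 95 rather than landing on exactly 95 every time. Raising `n_samples` to a few thousand makes the proportion of intervals that cover the true mean settle closer to 95 percent.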